1.  Introduction


Let $X_1, X_2, \ldots, X_n$, $n \ge 2$, be i.i.d. random variables with common distribution function $F$. Let $h$ be a symmetric function of $r$ variables such that $h(X_1,\ldots,X_r)$ has mean zero and such that $E[h(X_1,\ldots,X_r) \mid X_1] = g(X_1)$ has a positive variance. Hoeffding [11] introduced the U-statistic

$$U_n = \binom{n}{r}^{-1} \sum_{\mathbf{i} \in C} h(X_{i_1},\ldots,X_{i_r}),$$

where $\sum_{\mathbf{i} \in C}$ denotes summation over the set $C$ of combinations $\mathbf{i} = \{i_1,\ldots,i_r\}$ of integers in $\{1,2,\ldots,n\}$. Hoeffding proved the asymptotic normality of

U-statistics. An investigation of the rate of convergence to normality, begun by Grams and Serfling [9] and continued by Bickel [1] and Chan and Wierman [4], resulted in the Berry-Esseen theorem for U-statistics by Callaert and Janssen [3]. They obtained the rate of convergence $O(n^{-1/2})$ assuming a finite absolute third moment for the kernel $h(X_1,\ldots,X_r)$. Recently, Helmers and Van Zwet [13], for the case of $r=2$, have relaxed the assumption to $E|g(X_1)|^3 < \infty$ and $E|h(X_1,X_2)|^t < \infty$ for some $t > 5/3$.

For a symmetric function $w(i_1,\ldots,i_r)$ on $(I_n)^r$, where $I_n = \{1,2,\ldots,n\}$, satisfying the condition that $w(i_1,\ldots,i_r) = 0$ if $i_j = i_k$ for some $j \ne k$, we define the weighted U-statistic

$$U_n = \sum_{\mathbf{i} \in C} w(i_1,\ldots,i_r)\,h(X_{i_1},\ldots,X_{i_r}).$$


Little is known concerning the asymptotic properties of such statistics, as noted by Serfling [15]. For kernels of degree $r=2$, Brown and Kildea [2] considered statistics of the form $S_n = \sum_{(i,j) \in C_n} h(X_i,X_j)$, where $K$ is fixed and, for each $n$, $C_n$ is a collection of pairs $(i,j)$ with $1 \le i < j \le n$, balanced in such a manner that each positive integer less than or equal to $n$ is present in exactly $2K$ pairs in $C_n$. These statistics are called balanced incomplete U-statistics or reduced U-statistics, and are clearly a special case of the weighted U-statistic with weights of 0 or 1 only. Brown and Kildea show that $S_n$, properly standardized, is asymptotically normal. Estimates based on reduced U-statistics are asymptotically equivalent to those based on the corresponding U-statistics, but require far fewer steps to compute. Brown and Kildea also obtain asymptotic normality in some cases when the balancing condition is relaxed.
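The remark that reduced U-statistics are weighted U-statistics with weights 0 or 1 can be illustrated with a minimal sketch. The function names, the circulant construction of the balanced design, the product kernel and the sample values are all assumptions made for illustration (they are not taken from Brown and Kildea [2]), and indices run from 0 rather than 1.

```python
from itertools import combinations


def weighted_u_statistic(xs, r, w, h):
    """Sum of w(idx) * h(x_{i1}, ..., x_{ir}) over all r-combinations idx."""
    total = 0.0
    for idx in combinations(range(len(xs)), r):
        weight = w(idx)
        if weight != 0:
            total += weight * h(*(xs[k] for k in idx))
    return total


def circulant_pairs(n, K):
    """Hypothetical balanced design: pairs {i, i+d mod n} for d = 1..K.

    With n > 2K all n*K pairs are distinct and every index lies in 2K pairs.
    """
    if n <= 2 * K:
        raise ValueError("need n > 2K so that all pairs are distinct")
    return {tuple(sorted((i, (i + d) % n)))
            for i in range(n) for d in range(1, K + 1)}


def appearances(pairs, n):
    """Number of pairs in which each index occurs."""
    counts = [0] * n
    for i, j in pairs:
        counts[i] += 1
        counts[j] += 1
    return counts


xs = [0.3, -1.2, 0.8, 2.0, -0.5, 1.1, -0.9]   # illustrative sample
design = circulant_pairs(len(xs), K=2)


def kernel(x, y):
    """Illustrative symmetric kernel of degree 2."""
    return x * y


# Reduced U-statistic = weighted U-statistic with 0/1 weights on the design.
s_n = weighted_u_statistic(
    xs, 2, lambda idx: 1.0 if idx in design else 0.0, kernel)
```

Here only $nK = 14$ of the $\binom{7}{2} = 21$ pairs carry weight 1, which is the computational saving the paragraph refers to.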

Sievers [17] considered the simple linear regression model $Y_i = \alpha + \beta x_i + e_i$, $1 \le i \le n$, where $\alpha$ and $\beta$ are unknown parameters, $x_1,\ldots,x_n$ are known regression scores, and $e_1,\ldots,e_n$ are i.i.d. random variables. He considered inferences for $\beta$ based on a weighted rank statistic defined by

$$T_\beta = \sum_{i=1}^{n-1}\sum_{j=i+1}^{n} a_{ij}\,\phi(Y_i - \alpha - \beta x_i,\ Y_j - \alpha - \beta x_j),$$

where $\phi(u,v) = 1$ if $u < v$ and $0$ if $u > v$. The weights are arbitrary, except that $a_{ij} = 0$ if $x_i = x_j$. Note that when the slope parameter has value $\beta$, $T_\beta$ is a weighted U-statistic. Sievers proved asymptotic normality of $T_\beta$ under restrictions on the weights and developed tests and confidence intervals for the value of the slope parameter $\beta$ based on $T_\beta$.
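A small sketch of how such a weighted rank statistic can be evaluated is given below. The function names, the weight choice $a_{ij} = |x_j - x_i|$ and the data are hypothetical; the weight choice is used only because it vanishes when $x_i = x_j$, as the definition requires, and ties $u = v$ are scored 0 here as a convention of this sketch.

```python
def phi(u, v):
    """Indicator kernel: 1 if u < v, else 0 (ties scored 0 in this sketch)."""
    return 1 if u < v else 0


def sievers_statistic(x, y, alpha, beta, a):
    """Double sum over i < j of a(i, j) * phi(residual_i, residual_j)."""
    n = len(x)
    resid = [y[i] - alpha - beta * x[i] for i in range(n)]
    return sum(a(i, j) * phi(resid[i], resid[j])
               for i in range(n - 1) for j in range(i + 1, n))


x = [1, 2, 3, 4]              # illustrative regression scores
y = [2.1, 3.9, 6.2, 7.8]      # illustrative responses, roughly y = 2x


def a(i, j):
    """Hypothetical weights; zero whenever x[i] == x[j]."""
    return abs(x[j] - x[i])


t_flat = sievers_statistic(x, y, alpha=0.0, beta=0.0, a=a)    # residuals increase
t_steep = sievers_statistic(x, y, alpha=0.0, beta=5.0, a=a)   # residuals decrease
t_mid = sievers_statistic(x, y, alpha=0.0, beta=2.0, a=a)
```

Because $\alpha$ is subtracted from both arguments of $\phi$, its value does not change the comparison, so in this sketch the statistic depends on the data only through the trial slope.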

Shapiro and Hubert [16] considered weighted U-statistics with kernels of order $r=2$, and proved asymptotic normality if $E[h(X_1,X_2)^2] < \infty$ and

$$\sum_{i<j} w_{ijn}^2 \Big/ \sum_{k=1}^{n} w_{k\cdot n}^2 \to 0$$

and

$$\max_{1 \le i \le n} w_{i\cdot n}^2 \Big/ \sum_{k=1}^{n} w_{k\cdot n}^2 \to 0,$$

where $w_{i\cdot n} = \sum_j w_{ijn}$. This result is then used to obtain asymptotic normality of permutation statistics of interest in biometry (Mantel and Valand [14]), geography (Cliff and Ord [5]) and clustering studies (Hubert and Schultz [12]).

Kepner and Robinson [13] considered weighted sums of multivariate functions with kernel of order $k$, and generalized the asymptotic normality results of Brown and Kildea [2] and Shapiro and Hubert. Note that the results of these papers and the present paper are valid when the kernel $h$ and weight function $w$ are replaced by sequences $h_n$ and $w_n$ satisfying the conditions assumed.
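Let $U_n = \sum_{\mathbf{i} \in C} w(i_1,\ldots,i_r)\,h(X_{i_1},\ldots,X_{i_r})$,

where $\sum_C$ denotes the sum over all combinations $\mathbf{i} = \{i_1,\ldots,i_r\}$ of integers from $\{1,2,\ldots,n\}$. Introduce the function $g$ by $g(X_{i_1}) = E[h(X_{i_1},\ldots,X_{i_r}) \mid X_{i_1}]$, and the sums of weights

$$w_i = \sum_{\mathbf{i} \ni i} w(\mathbf{i}) = \sum_{i_2}\cdots\sum_{i_r} w(i,i_2,\ldots,i_r), \qquad \{i,i_2,\ldots,i_r\} \in C,$$

and

$$w_{ij} = \sum_{\mathbf{i} \ni i,j} w(\mathbf{i}) = \sum_{i_3}\cdots\sum_{i_r} w(i,j,i_3,\ldots,i_r), \qquad \{i,j,i_3,\ldots,i_r\} \in C.$$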
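The two families of weight sums can be accumulated in a single pass over the combinations. The sketch below is illustrative only: the function name and the 0-based indexing are assumptions, and the weight function is passed in as a callable on sorted index tuples.

```python
from itertools import combinations


def weight_sums(n, r, w):
    """Return (w_i, w_ij): sums of w over combinations containing i, resp. i and j."""
    w_i = [0.0] * n
    w_ij = {}
    for idx in combinations(range(n), r):
        weight = w(idx)
        for k in idx:
            w_i[k] += weight
        for pair in combinations(idx, 2):
            w_ij[pair] = w_ij.get(pair, 0.0) + weight
    return w_i, w_ij


# Equal weights: every index lies in C(n-1, r-1) combinations,
# every pair of indices in C(n-2, r-2) combinations.
w_i, w_ij = weight_sums(5, 3, lambda idx: 1.0)
```

For equal weights with $n = 5$ and $r = 3$ this gives $w_i = \binom{4}{2} = 6$ for every index and $w_{ij} = \binom{3}{1} = 3$ for every pair.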


Let $r_n = \sum_{i=1}^{n} |w_i|^3$, $s_n^2 = \sum_{i=1}^{n} w_i^2$ and $t_n = \sum_{i=1}^{n}\sum_{j=i+1}^{n} w_{ij}^2$.


The projection of $U_n$ is given by

$$\hat U_n = \sum_{i=1}^{n} E[U_n \mid X_i] = \sum_{i=1}^{n}\sum_{\mathbf{i} \ni i} w(i_1,\ldots,i_r)\,E[h(X_{i_1},\ldots,X_{i_r}) \mid X_i],$$

or alternatively

$$\hat U_n = \sum_{\mathbf{i} \in C}\sum_{j=1}^{r} w(i_1,\ldots,i_r)\,E[h(X_{i_1},\ldots,X_{i_r}) \mid X_{i_j}] = \sum_{\mathbf{i} \in C} w(i_1,\ldots,i_r)\,[g(X_{i_1})+\cdots+g(X_{i_r})].$$


Let $\sigma_g^2 = \mathrm{Var}(g(X_1))$, $\sigma_h^2 = \mathrm{Var}(h(X_1,\ldots,X_r))$, and $\sigma_n^2 = \mathrm{Var}(U_n)$. Calculate

$$\hat\sigma_n^2 = \mathrm{Var}(\hat U_n) = \sum_{i=1}^{n} w_i^2\,\mathrm{Var}(g(X_i)) = \sigma_g^2\sum_{i=1}^{n} w_i^2.$$
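The variance calculation relies on the projection being a weighted sum of independent terms, $\sum_i w_i\,g(X_i)$, which is the per-combination form regrouped by index. The sketch below checks this regrouping numerically; the function name, the weights, the sample and the stand-in function `g` are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def projection_two_ways(xs, r, w, g):
    """Evaluate the projection by index and by combination; also return the w_i."""
    n = len(xs)
    w_i = [0.0] * n
    by_combination = 0.0
    for idx in combinations(range(n), r):
        weight = w(idx)
        # Per-combination form: w(idx) * [g(x_{i1}) + ... + g(x_{ir})].
        by_combination += weight * sum(g(xs[k]) for k in idx)
        for k in idx:
            w_i[k] += weight
    # Per-index form: sum of w_i * g(x_i).
    by_index = sum(w_i[k] * g(xs[k]) for k in range(n))
    return by_index, by_combination, w_i


xs = [0.4, -1.0, 0.7, 1.5, -0.6]                # illustrative sample


def w(idx):
    """Hypothetical weights depending on the spread of the index set."""
    return 1.0 / (1 + idx[-1] - idx[0])


def g(x):
    """Placeholder for the conditional expectation g; not derived from a kernel."""
    return x ** 3 - x


by_index, by_combination, w_i = projection_two_ways(xs, 3, w, g)
variance_factor = sum(v ** 2 for v in w_i)      # multiplies Var(g(X_1)) in Var of the projection
```

Since the $X_i$ are independent, the per-index form makes the variance a sum of the squared weight sums times $\mathrm{Var}(g(X_1))$, which is what `variance_factor` collects.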


Three  conditions  on  the  weights  are  required  for  the  statement  of  the 
result. 

Condition  (1) :  There  exists  B  <  1  for  which 


max  w^  <  (Bs^/r) A(sVr^)  for  all  n  >  r  +  1 

1  -  n  n  n  — 

l<i<n 


^  ij  .  1  -1  11/3  -9  ..  ,  „-l  5 

Condition  (2):  - n  r  s  [t  log (0,0  s  r  t  )} 

2-3  n  n  n  hgnnn 

max  w^ 

Condition (3): $t_n \le C r_n/s_n$ for some $C$, $0 < C < \infty$.

Theorem: If $h(X_1,\ldots,X_r)$ has finite absolute third moment and the weights satisfy Conditions (1), (2), and (3), then

$$\sup_x |P(\sigma_n^{-1}U_n \le x) - \Phi(x)| = O(r_n/s_n^3), \quad \text{as } n \to \infty.$$


The most restrictive condition on the weights is Condition (2), which is derived from the characteristic function bounds in the Fourier analytic approach to the Berry-Esseen result. With Conditions (1) and (2) satisfied, the theorem holds for $U_n/\hat\sigma_n$. Condition (3) permits replacement of $\hat\sigma_n$ by $\sigma_n$.




The present paper generalizes the result of Callaert and Janssen [3], since if $w(\mathbf{i}) = 1$ for all $\mathbf{i}$, Conditions (1), (2), and (3) are satisfied, and in this case the rate of convergence is $O(r_n/s_n^3) = O(n^{-1/2})$. For the case of unequal weights satisfying $0 < A \le w(\mathbf{i}) \le B$ for all $\mathbf{i}$, the Theorem applies and provides an $O(n^{-1/2})$ rate of convergence. In fact, one sufficient condition for the convergence rate $O(n^{-1/2})$ is

(*)  $\max w_{ij} / \min w_{ij} \le B$,

which holds for the above-mentioned cases in this paragraph. One may observe from Conditions (1), (2) and (3) that the bound on the convergence rate depends on the weights only through their sums $w_i$ and $w_{ij}$, so individual weights may differ greatly without violating the hypotheses of the Theorem.
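The equal-weight rate can be checked directly. The sketch below assumes the readings $r_n = \sum_i |w_i|^3$ and $s_n^2 = \sum_i w_i^2$ (the notation of the Berry-Esseen theorem for non-identically distributed summands in Feller [8]); the function name is hypothetical. With $w \equiv 1$ every $w_i$ equals $\binom{n-1}{r-1}$, so the ratio $r_n/s_n^3$ reduces to $n^{-1/2}$ for every $n$.

```python
from math import comb, sqrt


def rate_equal_weights(n, r):
    """r_n / s_n^3 when every combination has weight 1 (assumed definitions)."""
    w_i = comb(n - 1, r - 1)        # each index lies in C(n-1, r-1) combinations
    r_n = n * w_i ** 3              # sum of |w_i|^3
    s_n = sqrt(n * w_i ** 2)        # square root of the sum of w_i^2
    return r_n / s_n ** 3


rates = {n: rate_equal_weights(n, 3) for n in (10, 100, 1000)}
```

The common factor $\binom{n-1}{r-1}^3$ cancels, which is why the degree $r$ of the kernel does not affect the order of the rate in this case.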


2.  Proof of Theorem


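Denote $(U_n - \hat U_n)/\hat\sigma_n$ by $\Delta_n$. Note that

$$\Delta_n = \hat\sigma_n^{-1}\sum_{\mathbf{i} \in C} w(i_1,\ldots,i_r)\,Y_{i_1,\ldots,i_r},$$

where $Y_{i_1,\ldots,i_r} = h(X_{i_1},\ldots,X_{i_r}) - g(X_{i_1}) - \cdots - g(X_{i_r})$.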


Split $\Delta_n$ into two parts $\Delta'_n$ and $\Delta''_n = \Delta_n - \Delta'_n$, with

$$\Delta'_n = \hat\sigma_n^{-1}\sum_{i_1=1}^{c_n}\sum_{i_2=i_1+1}^{c_n}\cdots\sum_{i_r=i_{r-1}+1}^{c_n} w(i_1,\ldots,i_r)\,Y_{i_1,\ldots,i_r}.$$


Restrictions on the choice of $c_n$ are found which provide the rate of convergence $O(r_n/s_n^3)$ for bounds on several terms to be estimated. Condition (2) ensures the existence of a choice of $c_n$ which satisfies all of these restrictions simultaneously. [This corresponds to the analysis of order bounds for $c_n$ and $d_n$ in section 3 of Callaert and Janssen [3].]


For any sequence $a_n$ of constants, an elementary calculation gives

$$\sup_x |P(U_n/\hat\sigma_n \le x) - \Phi(x)| \le \sup_x |P(S_n + \Delta'_n \le x) - \Phi(x)| + P(|\Delta''_n| \ge a_n) + O(a_n),$$

where $S_n = \hat U_n/\hat\sigma_n$.


Then, letting $\phi_X$ denote the characteristic function of a random variable $X$, for x>0.

es3/r  -  .2  ■_ 

^  n  n 

<  ^  (t)  -  ‘*»g  ^^.(t)Idt. 


0 


0 


n 


n  n 


Since $S_n$ is a sum of independent random variables with finite absolute third moments, a standard Berry-Esseen argument (see e.g. Feller [8], p. 544) yields

$$\int_0^{\varepsilon s_n^3/r_n} t^{-1}\,|\phi_{S_n}(t) - e^{-t^2/2}|\,dt \le C_1\nu_3\sigma_g^{-3}\,r_n s_n^{-3}$$

for an absolute constant $C_1$, where $\nu_3 = E|h(X_1,\ldots,X_r)|^3$, and we may take


e  =  Y^’b- 


The  majority  of  the  proof  determines  the  bound  for  the  remaining 


integral.  Writing  p  for  the  characteristic  function  of  with  e  as 


above ,  we  have 


I  n  ( 0)  I  <  e“  3  ® 


for  0|  <  eOg^- 


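Begin by estimating

$$|E[e^{itS_n}(1 - e^{it\Delta'_n})]| \le |t|\,|E[e^{itS_n}\Delta'_n]| + \tfrac12 t^2 E(\Delta'_n)^2,$$

and note that by independence

$$E[e^{itS_n}\Delta'_n] = \hat\sigma_n^{-1}\sum_{\mathbf{i} \in C} w(\mathbf{i})\,E\Big[e^{it\hat\sigma_n^{-1}\sum_{k \notin \mathbf{i}} w_k g(X_k)}\Big] \times E\Big[e^{it\hat\sigma_n^{-1}(w_{i_1}g(X_{i_1})+\cdots+w_{i_r}g(X_{i_r}))}\,Y_{i_1,\ldots,i_r}\Big].$$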


For a fixed combination $\mathbf{i} = \{i_1,\ldots,i_r\}$, assuming Condition (1) holds, for $0 \le t \le \varepsilon s_n^3/r_n$,


itOn^  r 

®  I 

k=i  .  j 
J 


=  n 


/  I 

<  e  3  (kj^i^,. 

^ (1-B) 

=  e"  3 


• /i^ 


2\  2 
w,  \0 
i  X)  g 

•'^r 


since 


0  <  t  <  es^/r  <  es  /max  w.  implies  w  for  all  k, 

nn  n-,..i  K  -  g 

l<i<n 


Also, for each fixed $\mathbf{i}$, since $E[f(X_{i_1})\,Y_{i_1,\ldots,i_r}] = 0$ for any bounded Borel measurable function $f$,


,it^n  I  ^i.g(Xi  ) 

j=l  ^  3  ij,...,i^ 

I  .  -1  V 

J  -  1 


^  =  1  -  itO  w  g(X.  )>  Y. 

^  ^  ^  ’  "  j-i  'j  J  . ^ 




^  I  "'i  E|g(X.  )g{X  )Y  .  I 

i-i  H  j  k  j  \ 


<  (r+i)v,t^a  f  y  w. 

3  n  1 


j=l 


Combine these estimates to obtain

$$|E\,e^{itS_n}\Delta'_n| \le (r+1)\,\nu_3\,\hat\sigma_n^{-3}\,t^2 e^{-(1-B)t^2/3}\sum_{\mathbf{i} \in C} w(i_1,\ldots,i_r)\,(w_{i_1}+\cdots+w_{i_r})^2.$$


To bound this sum, write

$$\sum_{\mathbf{i} \in C} w(i_1,\ldots,i_r)\,[w_{i_1}^2+\cdots+w_{i_r}^2+2w_{i_1}w_{i_2}+\cdots+2w_{i_{r-1}}w_{i_r}].$$


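First,

$$\sum_{\mathbf{i} \in C} w(i_1,\ldots,i_r)\,w_{i_1}^2 = \sum_{i_1}\Big\{w_{i_1}^2\sum_{\mathbf{i} \ni i_1} w(i_1,\ldots,i_r)\Big\} \le \sum_{i_1=1}^{n} w_{i_1}^3,$$

and similarly for each of the squared terms' contributions. For the cross-product terms, by Hölder's inequality,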


ll  1  I2  1  2 

i€C 


)  )  w.  w.  w.  . 

^  ^  1X11 

il=li2=i^ll  2V2 


V  V  2  _  r  r  ?  ._2  ..  1*3 


<  I  I  W.W..  J^w.'w.. 

—  ^  Ij  111  ^  ^111 


<  I 
-  .  1 


There are $\binom{r}{2}$ cross-product sums, each with a coefficient of 2, so combining the bounds, the overall sum is bounded by

$$r^2\sum_{i=1}^{n} w_i^3.$$
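The bound on the combined sum can be checked numerically for nonnegative weights. This sketch is illustrative only: the function name and the two weight functions are assumptions, and nonnegativity of the weights is assumed so that $|w_i|^3 = w_i^3$.

```python
from itertools import combinations


def sum_and_bound(n, r, w):
    """Return (sum over combinations of w * (w_{i1}+...+w_{ir})^2, r^2 * sum of w_k^3)."""
    combos = list(combinations(range(n), r))
    w_k = [0.0] * n
    for idx in combos:
        weight = w(idx)
        for k in idx:
            w_k[k] += weight
    lhs = sum(w(idx) * sum(w_k[k] for k in idx) ** 2 for idx in combos)
    rhs = r ** 2 * sum(v ** 3 for v in w_k)
    return lhs, rhs


lhs_equal, rhs_equal = sum_and_bound(5, 3, lambda idx: 1.0)
lhs_uneven, rhs_uneven = sum_and_bound(6, 3, lambda idx: 1.0 + idx[0])
```

For equal weights with $n = 5$, $r = 3$ the sum is $10 \cdot 18^2 = 3240$ against the bound $9 \cdot 5 \cdot 6^3 = 9720$, so the bound is not tight but has the right order in the weight sums.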


Hence  for  d  <  ^ —  , 
n  -  r 

n 


A  t  ^  |e  ]|dt 


r2(rfl)V3  n  d 

- 4  t  e - -  dt 

a  i=i 

n 


r^(r+l)v  r 

i  ^  I  (1-3,-^^^ 

a  s 

g  n 


independent of the choice of $d_n$. Note also that the choice of $c_n$ played no role in the computation of this bound. To bound $E[(\Delta'_n)^2]$, note that


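$$E[Y_{i_1,\ldots,i_r}\,Y_{j_1,\ldots,j_r}] = 0$$

if the combinations $\mathbf{i}$ and $\mathbf{j}$ contain zero or one common indices. [See Grams and Serfling [9].] Otherwise,

$$|E[Y_{i_1,\ldots,i_r}\,Y_{j_1,\ldots,j_r}]| \le (r+1)^2\sigma_h^2.$$

Then

$$E[(\Delta'_n)^2] = \hat\sigma_n^{-2}\,E\sum_{\mathbf{i} \in C}\sum_{\mathbf{j} \in C} w(\mathbf{i})w(\mathbf{j})\,Y_{\mathbf{i}}Y_{\mathbf{j}} = \hat\sigma_n^{-2}\sum_{|\mathbf{i} \cap \mathbf{j}| \ge 2} w(\mathbf{i})w(\mathbf{j})\,E[Y_{\mathbf{i}}Y_{\mathbf{j}}]$$

$$\le (r+1)^2\sigma_h^2\,\hat\sigma_n^{-2}\sum_{i_1=1}^{n}\sum_{i_2=i_1+1}^{n} w_{i_1 i_2}^2 = (r+1)^2\sigma_h^2\,\hat\sigma_n^{-2}\,t_n.$$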

The choice of $d_n$ is determined by the bound for $E(\Delta'_n)^2$.


Choosing  d  =  r 
^  n  ri 


the bound becomes

$$\tfrac12 E(\Delta'_n)^2\int_0^{d_n} t\,dt \le \tfrac14 (r+1)^2\sigma_h^2\,\hat\sigma_n^{-2}\,t_n d_n^2 \le \tfrac14 (r+1)^2\sigma_h^2\sigma_g^{-2}\,r_n/s_n^3.$$
h  g  n  n 


yUk— .  fe 

The  estimates  above  provide  the  bound  required  for  ~  ^(X)  |  for 


all  n  such  that  e  s  /r  <  d  . 

n  n  -  n 

3 

For  the  case  when  d  <  r  /s  ,  write 

n  n  n 


its,,  itA'  I 

E  e  (1-e  ) 

I  .  .  ft -1  r 


n  E  e  -  n  (1-e 


=  E[e  „  J||E[e 

<  A'  3  k>l„“k''  'A 


-  T  Cf  )  Wj^t  a 

•J  II  ^ 


<  t  E  |A'|  e 


The bound for $E|\Delta'_n|$ is obtained from Lyapunov's inequality ([6], p. 47) and the previous bound for $E(\Delta'_n)^2$:

$$E|\Delta'_n| \le E^{1/2}(\Delta'_n)^2 \le (r+1)\,\sigma_h\,\hat\sigma_n^{-1}\,t_n^{1/2}.$$


Choose  c  so  that 
n 


r’  7  -1  -2  ,-15  -2  5}  , 

(*)  y  w  >  3a  a  d  log  (a  0  S  r  t  ). 

k>c 

n 


es^/r 

|dt 



<(r+l)  /f  3  J  at 


<  (r+1) 


T  5  C:^  I  2 

"  "  ^  k>cn\ 


=  e(r+l) a  a~^r  /s^  . 

h  g  n  n 


Note  that  inequality  (*)  is  satisfied  if 


2  _2  -1. 


(n-c  )  min  >  3a  ^s  a^r  -^t  log  (a  a 

"  ^  ~  gnnnn  ^'hg  n  n  n^ 


l<i<n 


providing a lower bound for $n - c_n$ to be used later in the proof.
To handle $\Delta''_n$, define $\zeta_j$ by


■  .  I  ,  fj,  “iV  '  I  Cy 

3"=n+l  i33  -  -  J-c  +1  3 


Since $E[\zeta_{j+1} \mid \zeta_i,\ i \le j] = 0$ a.s. for all $j$, the $\zeta_j$ are martingale summands, and by optional skipping, $V_k = \sum_{j=c_n+1}^{c_n+k} \zeta_j$ forms a martingale, $k = 1,2,\ldots,n-c_n$. By a theorem of Dharmadhikari, Fabian, and Jogdeo [7], for $k = n - c_n$,

$$E|V_{n-c_n}|^3 \le 2^{12}\,(n-c_n)^{3/2}\max_{c_n+1 \le j \le n} E|\zeta_j|^3.$$


However, for fixed $j \ge c_n + 1$,


~  I  ^  I  '^i^i^  '  k-l,2,...,j  1 
i=l  i3j,i  - 


is  also  a  martingale.  A  second  application  of  the  theorem  of  Dharmadhikari, 
Fabian,  and  Jogdeo  [7]  yields 


Now 


;(CJ^  =  e|w  P  <  2^^(j-l?^^max  e|  I  w  y  (- 

^  ■  l<i<j-l  i9j,i-- 

;i  y  w.Y.P  <  y  y  y  w.  w.  w.  e  y.  y.  y. 

i33,x - 

<  (r+i)  Vo  y  y  y  w.  w.  w. 

-l'-2'-3  -1  -2  -3 


=  {r+l)\^ 


Therefore, 


e|A"|^  <  2^^  (r+1)  (n-c^)  [max 


1/3 


By the Markov inequality,

$$P(|\Delta''_n| \ge a_n) \le a_n^{-3}\,E|\Delta''_n|^3.$$

Taking  a  =  [ (n-c  ^  max  w?.]^  yields 

n  L  n  n  13J 

P(lA"|  >  V  ^  2^^(r+l)\3a^. 


If  c  is  chosen  so  that 
n 


n-c  < 


r8/3 

n 


n  -  -  6  2 

no  max  w.  . 
n  X,] 


then  a  <  — - 
n  -  3 

s 

n 


Finally, if both conditions concerning $c_n$ may be satisfied simultaneously, the $O(r_n/s_n^3)$ rate of convergence is obtained for $U_n/\hat\sigma_n$.

This provides the condition


2  11/3 

max  w.  r  i  c  o  i. 

1/3  In  ^  ,  -15  -2^*s, 

- r —  <  - - - —  t  log  (a  a  s  r  t  ) 

2  -3n9  n  ^hgnnn 


mxn  w. 
X 


n 
Note that the condition depends on the weights only through their sums $w_i$ and $w_{ij}$.


To replace $\hat\sigma_n$ by $\sigma_n$, note that


Var(U^)  =  I  E[h(X^  , . . . ,X^  )h(X.  ,...,X. 

i  j  —  —  1  r  ^1 


)1 


^  ^  w.w.cr^  +  Iw.w.  E[X. 
i  j  ^  3  g  i  1-^ 
in  i  1=1  in  i!>2 


,X  )h(X  ,...,x  )1 

r  ^1 


+  y  yw.w.Eih(x. 

.  %  1  g  f*  f  i  1  1, 

1=1  ^  i  i  ~  1 

in  j  I  >2 


.,X.  )h(X.  ,...,X.  )] 
^r  ^1 


implies  |o^  -0^  1  <  I  Iw.w.  =0^  y 

'n  n  -  hf-fij  h  ^  ''  ii ' 
i  2  -  ^  i  3  D 

i  n  j  I  >2 


Therefore 


O, 


^  -  11  < 


0 

i 

_il_  1 

M 

/N 

0 

0'  ' 

n 

n 

2  ^2 

a  -  a 
n 


2  y  y  2 

^  a,  v  tw,-  . 

<  h  1  1  ii 

2  n  2 

0  )w. 

g  4"  1 


If $t_n \le C r_n/s_n$, then there exists a constant $K$ such that

$$P(\hat\sigma_n^{-1}U_n \le (1 - Kr_n/s_n^3)x) \le P(\sigma_n^{-1}U_n \le x) \le P(\hat\sigma_n^{-1}U_n \le (1 + Kr_n/s_n^3)x)$$

for all positive real numbers $x$, with a similar inequality for $x < 0$.

By the assertion of the theorem,

$$P(\sigma_n^{-1}U_n \le x) \le P(\hat\sigma_n^{-1}U_n \le (1 + Kr_n/s_n^3)x)$$

$$\le \Phi((1 + Kr_n/s_n^3)x) + Lr_n/s_n^3$$

$$\le \Phi(x) + Lr_n/s_n^3 + (2\pi)^{-1/2}e^{-1/2}Kr_n/s_n^3$$

$$\le \Phi(x) + Mr_n/s_n^3.$$
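The step from $\Phi((1+\delta)x)$ to $\Phi(x)$ in the chain above, with $\delta = Kr_n/s_n^3$, rests on the elementary bound $\Phi((1+\delta)x) - \Phi(x) \le \delta\,x\,\varphi(x) \le \delta\,(2\pi e)^{-1/2}$ for $x > 0$, where $\varphi$ is the standard normal density and $x\varphi(x)$ is maximized at $x = 1$. The sketch below checks this numerically on a grid; the function names, the grid and the values of $\delta$ are assumptions made for illustration.

```python
from math import e, erf, pi, sqrt


def Phi(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def max_gap(delta, grid_max=10.0, steps=10000):
    """Largest value of Phi((1 + delta) x) - Phi(x) over a grid of x in [0, grid_max]."""
    return max(Phi((1.0 + delta) * x) - Phi(x)
               for x in (grid_max * k / steps for k in range(steps + 1)))


BOUND_CONSTANT = 1.0 / sqrt(2.0 * pi * e)       # sup over x > 0 of x * phi(x)

gaps = {delta: max_gap(delta) for delta in (0.01, 0.05, 0.2)}
```

The gap grows linearly in $\delta$, so a multiplicative perturbation of the scale of order $r_n/s_n^3$ costs only an additive term of the same order in the distribution function.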

Using similar reasoning for the lower bound, the replacement of $\hat\sigma_n$ by $\sigma_n$ is shown to preserve the convergence rate $r_n/s_n^3$.

